Abstract

A tandem cell is proposed for DNA sequencing in which an exonuclease enzyme cleaves bases
(mononucleotides) from a strand of DNA for identification inside a nanopore. It has two
nanopores and three compartments with the structure [cis1, upstream nanopore (UNP),
trans1/cis2, downstream nanopore (DNP), trans2]. The exonuclease is attached to the exit side
of UNP in trans1/cis2. A cleaved base cannot regress into cis1 because of the remaining DNA
strand in UNP. A profiled electric field over DNP with positive and negative components slows
down base translocation through DNP. The proposed structure is modeled with a Fokker-Planck
equation and a piecewise solution presented. Results from the model indicate that with
probability approaching 1 bases enter DNP in their natural order, are detected without any loss,
and do not regress into DNP after progressing into trans2. Sequencing efficiency with a tandem
cell would then be determined solely by the level of discrimination among the base types inside
DNP.

Keywords: nanopore, DNA sequencing, exonuclease, Fokker-Planck equation, tandem cell 

I INTRODUCTION 

The success of the Human Genome Project has seen increased efforts aimed at efficiently 
sequencing whole genomes. A variety of methods have been developed for high throughput 
sequencing of DNA molecules of varying lengths based on a wide range of technologies that 
may be label-based or label-free. Most of them require considerable amounts of preprocessing, 
preparation, and reagents. Comprehensive information about these methods can be found in 
several reviews [1,2]. 

Nanopore-based methods use a minimal amount of preparation and rely primarily on the 
detection of a current in the pore when the analyte (usually a DNA molecule or fragment thereof, 
but more generally any polymer) passes through. Such sequencing is efficient in principle and 
leads to a device footprint that is smaller than in the other NGS methods. Nanopore sequencing 
has been reviewed in detail recently [3, 4]. 

The present work proposes a nanopore-based sequencing structure that has two nanopores in 
tandem and an exonuclease enzyme that is attached to the output side of the first nanopore and 
cleaves bases from an ssDNA molecule that has threaded through the first pore. The cleaved 
bases pass through the second pore and are detected there. A profiled voltage applied across the 
second pore slows down translocation of the base thus reducing the detection bandwidth 
required. The structure is modeled with a Fokker-Planck equation and a piecewise solution 
presented. Using results from recent experiments with exonuclease-based sequencing [5] in 
which single bases are found to cause blockade levels unique to a base type, the physical model 
shows that in theory a tandem cell is capable of efficiently sequencing a single-stranded DNA 
molecule. 

II NANOPORE-BASED SEQUENCING OF DNA 






Downloaded from http://biorxiv.org/ on September 18, 2014



Figure 1a shows the basic structure of a conventional nanopore-based electrolytic cell. Negative
(positive) ions flow through the pore from cis (trans) to trans (cis) under the influence of a
positive (negative) potential difference. This structure can be used to identify the sequence of
bases in a strand of DNA.

In 'strand sequencing' (Figure 1a) a strand of DNA is introduced into cis, and the negatively
charged biomolecule is drawn into the pore. Successive bases (A, T, C, G) passing through the 
pore are identified by the current blockade level [6]. The level of discrimination depends on the 
length and width of the pore and the speed with which the strand translocates through the pore. 

In 'exonuclease sequencing' (Figure 1b), an exonuclease enzyme in cis next to the vestibule
of an αHL nanopore cleaves single bases in ssDNA and drops them into the pore where they are
identified by the current blockade level [5]. This method has a number of problems such as 
cleaved bases diffusing back into the cis compartment to be 'lost' or called out of order, two or 
more cleaved bases occupying the pore at the same time, and excessive translocation speeds. The 
method has been modeled mathematically and analyzed in detail [7]. 

Pores may be biological or synthetic. Biological pores include αHL [8] and MspA [9].
Synthetic pores may use silicon nitride (Si3N4) and/or graphene [10-12]. In the latter case
changes in the lateral current flowing through the sub-nanometer thick graphene are used to
uniquely identify bases when ssDNA translocates from cis to trans through a nanopore in the
graphene [13]. Other synthetic pores studied include 'DNA transistors' [14] and silicon-based
gated MOSFET-like structures [15, 16].

In the following sections a tandem nanopore structure based on the exonuclease-based DNA 
sequencing method described above [5] is proposed and analyzed theoretically. 

III TANDEM CELL: PHYSICAL STRUCTURE AND RATIONALE

A tandem cell (Figure 2) consists of two conventional electrolytic nanopore-based cells
connected in tandem and an exonuclease enzyme in between. It has MspA or αHL for the first
(upstream) pore (UNP) with the exonuclease enzyme covalently bonded to it on the trans side, a
silicon microchannel for trans1/cis2, and a solid-state pore for the second (downstream) pore
(DNP). A potential difference is applied through electrodes situated at the top of cis1 and the
bottom of trans2. The enzyme cleaves bases from a single-stranded DNA (ssDNA) molecule that
has translocated through UNP, and the cleaved bases are detected during their passage through
DNP.

Formally the tandem cell can be written as [cis1, UNP, trans1/cis2, DNP, trans2], a pipeline
of five sections numbered 1 through 5, with 0 referring to the top of cis1. The following notation
is used: $V_{ij}$ is the potential difference between sections i and j, $L_{i-1,i}$ the length of
section i, $E_{i-1,i}$ the electric field across it, and $v_{i-1,i}$ the drift velocity. $V_i$ is the
applied voltage at the bottom of section i (or the top of section i+1) and also represents a
physical electrode. $V_0$ is the voltage at the top of cis1, assumed without loss of generality
to be held at ground. Minor variations are used as convenient.

Rationale 

The tandem cell requires a single DNA strand to be captured in the mouth of UNP, thread
through UNP, and present itself to the exonuclease enzyme on the trans side of UNP
(trans1/cis2) to be cleaved. Capture-threading has been experimentally found to always occur if
the voltage across the pore is sufficiently large (see Figure 7 in [3]). This can be realized in
practice with the tandem cell for the following reason. The bulk of the applied voltage across the
cis and trans compartments of a conventional cell drops across the pore, with only about 1%
over the cis and trans compartments [5, 7]. In a tandem cell, for an applied voltage $V_{05}$ = 0.4 V
assume that 49.5% of the voltage drops across each of the two pores. This means ~198 mV
across UNP, which is sufficient to ensure capture-threading of the DNA in and through UNP.
For this reason in the model below it is assumed that ssDNA is always captured in the mouth of
UNP and threads through UNP to the top of trans1/cis2 for cleaving by the enzyme, and that the
cleaved bases are delivered to DNP for detection.

For sequencing to be accurate it is necessary (in a statistical sense) that: a) cleaved bases 
arrive at and be captured by DNP in their natural order; b) DNP identify each and every base as 
it passes through; and c) the detected base exit DNP without regressing. 

All these conditions are satisfied in the tandem cell. This statement is justified informally 
below, with a more formal quantitative analysis provided in Section IV. 

1) The leading base of the ssDNA is cleaved by the enzyme on the trans1/cis2 side of UNP at a
rate that varies from one every 10 msecs [5, 7] to one every 80 msecs [17]. Thus on the average
cleaved bases are separated in time by at least 10 msecs. Now consider a cleaved base drifting
toward DNP under the influence of $V_{05}$. With 99% of the applied voltage dropping across the two
pores, the remaining 1% of the applied voltage of 0.4 V drops across cis1, trans1/cis2, and
trans2. Assuming for simplicity that it is divided among them in the proportion 3:4:3 (the three
compartments are roughly of the same height and contain the same electrolyte), there is a drop of
1.6 mV across trans1/cis2. With a length of 1 μm and mobility $\mu = 2.4 \times 10^{-8}$ m$^2$/volt-sec, the
electric field is 1600 V/m and the drift velocity is $3.84 \times 10^{-5}$ m/sec. The mean translocation time
for a cleaved base to drift-diffuse through trans1/cis2 is then about 25 msecs. If the spread
(standard deviation) of the translocation time is not too high then successive bases (which are
cleaved at the rate of 1 every 10 msecs) will not enter DNP out of order. Also a cleaved base
cannot regress into cis1 because the remaining DNA strand blocks its passage.

2) With ~198 mV dropping across DNP, DNP length of 10 nm, and base mobility $\mu = 2.4 \times 10^{-8}$
m$^2$/volt-sec (assumed to be the same for all base types), the mean translocation time through
DNP is about 20 nsecs, which is much less than the time between two successive bases arriving
at DNP (~10 msecs). Thus two bases do not occupy DNP at the same time.

3) A detected base exits into trans2 under the influence of $V_{05}$. With 198 mV across DNP (which
is close to the optimum value of 180 mV given in [5]) an exiting base will not regress into the
pore from trans2. The likelihood of regression can be further reduced by providing a reinforcing
drift field across trans2. Thus two electrodes may be used in trans2, one at the top and the other
at the bottom, and a voltage difference applied across them (Figure 3). (In this case the height of
trans2 can be larger than 1 μm.)

From 1) through 3) one can conclude that with a tandem cell bases enter DNP in the correct 
order (with no more than one occupying DNP at the same time), are not skipped, and having 
passed through do not regress into DNP. Assuming that DNP correctly identifies each base (that 
is, blockade levels are sufficiently discriminating of base type), the tandem cell can effectively 
sequence a single strand of DNA. 
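
The informal estimates in items 1) through 3) can be reproduced with a few lines of arithmetic. The sketch below is drift-only (it ignores diffusion, which is treated in Section IV) and uses only the values quoted above; the helper name `drift_estimate` and the constant names are introduced here for illustration and are not part of the original model.

```python
# Drift-only estimates for the tandem cell (diffusion is ignored here).
MU = 2.4e-8          # nucleotide mobility, m^2/(V*s), as quoted in the text
V_APPLIED = 0.4      # applied voltage V_05, volts

# 49.5% of V_05 drops across each pore; the remaining 1% is assumed to be
# shared 3:4:3 by cis1, trans1/cis2 and trans2.
V_PORE = 0.495 * V_APPLIED
V_TRANS1_CIS2 = 0.01 * V_APPLIED * 4 / (3 + 4 + 3)


def drift_estimate(voltage, length, mobility=MU):
    """Return (field in V/m, drift velocity in m/s, length/velocity in s)."""
    field = voltage / length
    velocity = mobility * field
    return field, velocity, length / velocity


if __name__ == "__main__":
    for name, volts, length in [("trans1/cis2", V_TRANS1_CIS2, 1e-6),
                                ("DNP", V_PORE, 10e-9)]:
        field, velocity, transit = drift_estimate(volts, length)
        print(f"{name}: E = {field:.3g} V/m, v = {velocity:.3g} m/s, "
              f"L/v = {transit:.3g} s")
```

With these inputs the drift-only transit time comes out near 26 msecs across trans1/cis2 and near 21 nsecs across DNP, in line with the rounded figures in items 1) and 2).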

IV ANALYTICAL MODEL 

The behavior of a cleaved base as it translocates through trans1/cis2 and DNP can be studied via
the trajectory of a particle whose propagator function G(x,y,z,t) is given by a linear
Fokker-Planck (F-P) equation in three dimensions. Such methods are commonly used in the study of
translocation of biomolecules through a nanopore [7]. The F-P equation is used for piecewise
analysis of the propagator in two sections: trans1/cis2 and DNP. Each section is modeled
independently in its own coordinate system and the transition occurring at the interface between
the two sections is studied separately. The coordinate systems used are shown in Figure 4.
Standard methods from partial differential equations and Laplace transforms are used [18, 19].

Translocation through DNP (Detection)

A one-dimensional approximation is applied to DNP (Figure 4a). The trajectory of the cleaved 
base as it passes through DNP is described by the function G(z, t) which satisfies 

$\partial G/\partial t + v_z\,\partial G/\partial z = D\,\partial^2 G/\partial z^2, \quad z \in [0, L = L_{34}]$ (1)

with initial and boundary value conditions
I.V. The particle is released at z = 0 at time t = 0:

$G(z, t = 0) = \delta(z)$ (2)

B.C. 1 The particle is captured at z = L:

$G(L, t) = 0$ (3)

B.C. 2 The particle is reflected at z = 0:

$D\,\partial G(z,t)/\partial z\,|_{z=0} = v_z\,G(0,t)$ (4)

Here $v_z$, the drift velocity through DNP, is given by $v_z = \mu V_{34}/L$, and $\mu$ is the nucleotide
mobility (assumed to be the same for all four types). Following standard procedures $\varphi(t)$, the
pdf of the first passage time (translocation time) for a particle to diffuse-drift from z = 0 to
z = L and get absorbed at z = L, can be obtained as

$\varphi(t) = \frac{2}{\sqrt{4\pi D t^3}} \Big[ \sum_{k=0}^{\infty} ((2k+1)L + v_z t)\,\exp\big(-((2k+1)L + v_z t)^2/(4Dt)\big) + \sum_{k=0}^{\infty} ((2k+1)L - v_z t)\,\exp\big(-((2k+1)L + v_z t)^2/(4Dt)\big) \Big]$ (5)

$\varphi(t)$ can be computed numerically but the series oscillates and converges very slowly. Therefore
an alternative closed-form approach is used, based on the earlier referenced model [7] of
exonuclease-based sequencing. In that model a base is assumed to be cleaved above the pore of a
conventional cell and to drop into the pore. This results in a non-zero probability of a cleaved
base not entering the pore (given by a rate constant k) and getting lost to diffusion. By setting
k to 0 the model is reduced to the boundary value problem in Equations 1-4. Also unlike in [7] where
the drift velocity is assumed to be always oriented in the downstream direction (cis to trans),
here it is assumed that the drift velocity $v_z$ can be positive or negative.

Of interest here is $\varphi(t)$, the pdf of the first passage time T (the time for a cleaved base to
translocate through the pore and be detected at the end of its translocation), which is independent
of the coordinate system. Modifying the main result in [7] the Laplace transform of the first
passage time of a cleaved base passing through DNP is

$\varphi^*(s) = \exp(\alpha/2) \,/\, [\cosh\gamma + (\alpha/2)\,\sinh\gamma/\gamma]$ (6)

where

$\alpha = v_z L/D; \quad \gamma = \sqrt{\alpha^2/4 + 2\tau s}; \quad \tau = L^2/(2D)$ (7)

The mean E(T) is

$E(T) = -d\varphi^*(s)/ds\,|_{s=0} = (L^2/(D\alpha))\,[1 - (1/\alpha)(1 - \exp(-\alpha))]$ (8)

Similarly, the second moment $E(T^2)$ can be obtained as:

$E(T^2) = d^2\varphi^*(s)/ds^2\,|_{s=0} = 2\,(L^2/(D\alpha^2))^2\,(\alpha^2/2 + 3\alpha\exp(-\alpha) - 2 + \exp(-\alpha) + \exp(-2\alpha))$ (9)

From here the variance $\sigma^2(T) = E(T^2) - E^2(T)$ is obtained as

$\sigma^2(T) = (L^2/(D\alpha^2))^2\,(2\alpha + 4\alpha\exp(-\alpha) - 5 + 4\exp(-\alpha) + \exp(-2\alpha))$ (10)

where $\sigma$ is the standard deviation.

For $v_z = 0$, these three statistics are given by

$E_0(T) = L^2/(2D), \quad E_0(T^2) = (5/12)\,(L^4/D^2), \quad \sigma_0^2(T) = (1/6)\,(L^4/D^2)$ (11)
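
The zero-drift values in Equation 11 follow from Equations 8 through 10 by expanding the exponentials for small $\alpha$. The steps are written out below as a check; the abbreviation $g(\alpha)$ for the bracket in Equation 10 is introduced here only for convenience.

```latex
\begin{align*}
E(T) &= \frac{L^2}{D\alpha}\left[1 - \frac{1 - e^{-\alpha}}{\alpha}\right]
      = \frac{L^2}{D\alpha}\left[\frac{\alpha}{2} - \frac{\alpha^2}{6} + O(\alpha^3)\right]
      \;\longrightarrow\; \frac{L^2}{2D} \quad (\alpha \to 0), \\
g(\alpha) &\equiv 2\alpha + 4\alpha e^{-\alpha} - 5 + 4e^{-\alpha} + e^{-2\alpha}
      = \frac{\alpha^4}{6} + O(\alpha^5), \\
\sigma^2(T) &= \left(\frac{L^2}{D\alpha^2}\right)^2 g(\alpha)
      \;\longrightarrow\; \frac{1}{6}\,\frac{L^4}{D^2} \quad (\alpha \to 0), \\
E_0(T^2) &= \sigma_0^2(T) + E_0^2(T)
      = \left(\frac{1}{6} + \frac{1}{4}\right)\frac{L^4}{D^2}
      = \frac{5}{12}\,\frac{L^4}{D^2}.
\end{align*}
```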

Figure 5 shows the mean and standard deviation of T for different voltages across a DNP
with L = 10 nm, $D = 3 \times 10^{-10}$ m$^2$/sec, and $\mu = 2.4 \times 10^{-8}$ m$^2$/volt-sec. In [5] the optimum
potential difference across the nanopore for detecting a base dropped into the pore of a
conventional cell is noted as 0.180 V. For this value the mean and standard deviation of the
translocation time are on the order of $10^{-8}$ sec. The resulting bandwidth is very high, on the
order of 40 MHz. (Figure 5 shows translocation statistics for both positive and negative values
of the voltage across DNP. Later in this section the possibility of slowing down the
translocating base and thereby reducing the bandwidth requirement using a negative voltage
across part of DNP is considered.)
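
As a consistency check on the Laplace-transform result, the mean can be recovered numerically from the transform and compared with the closed form. The sketch below assumes the reading of Equations 6 through 8 in which $\alpha = v_z L/D = \mu V/D$, $\gamma = \sqrt{\alpha^2/4 + 2\tau s}$, and $\tau = L^2/(2D)$; the function names and the finite-difference step are choices made here for illustration.

```python
import math

D = 3e-10      # diffusion coefficient, m^2/s (value quoted in the text)
MU = 2.4e-8    # nucleotide mobility, m^2/(V*s) (value quoted in the text)
L = 10e-9      # DNP length, m


def phi_star(s, voltage):
    """Laplace transform of the first-passage-time pdf (Equations 6 and 7).

    Assumes alpha^2/4 + 2*tau*s > 0, which holds for the small |s| used here
    when the voltage is nonzero.
    """
    alpha = MU * voltage / D              # alpha = v_z L / D with v_z = mu V / L
    tau = L ** 2 / (2 * D)
    gamma = math.sqrt(alpha ** 2 / 4 + 2 * tau * s)
    return math.exp(alpha / 2) / (
        math.cosh(gamma) + (alpha / 2) * math.sinh(gamma) / gamma)


def mean_eq8(voltage):
    """Closed-form mean translocation time (Equation 8); voltage must be nonzero."""
    alpha = MU * voltage / D
    return (L ** 2 / (D * alpha)) * (1 - (1 - math.exp(-alpha)) / alpha)


def mean_numeric(voltage, rel_step=1e-4):
    """-d(phi*)/ds at s = 0, by a central difference with step rel_step / tau."""
    h = rel_step / (L ** 2 / (2 * D))
    return -(phi_star(h, voltage) - phi_star(-h, voltage)) / (2 * h)


if __name__ == "__main__":
    v = 0.18  # volts, the optimum pore voltage noted in [5]
    print(f"phi*(0)      = {phi_star(0.0, v):.12f}")
    print(f"mean (Eq. 8) = {mean_eq8(v):.4e} s")
    print(f"mean (numeric derivative) = {mean_numeric(v):.4e} s")
```

The transform evaluates to 1 at s = 0, as required for a normalized pdf, and the two estimates of the mean agree closely at 0.18 V, where both are on the order of $10^{-8}$ sec.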

Translocation through trans1/cis2 (Delivery)

For simplicity the trans1/cis2 compartment is assumed to be a rectangular box-shaped region
(Figure 4b). A particle is released at the top and translocates to the bottom of the compartment
where it is 'absorbed'. 'Absorption' here means that the particle moves into DNP without
regressing into trans1/cis2. Its behavior in DNP is described by the model pertaining to that
section. The propagator function G(x, y, z, t) is given by a linear Fokker-Planck equation in three
dimensions:

$\partial G/\partial t + v_x\,\partial G/\partial x + v_y\,\partial G/\partial y + v_z\,\partial G/\partial z = D\,(\partial^2 G/\partial x^2 + \partial^2 G/\partial y^2 + \partial^2 G/\partial z^2)$ (12)

where $v_x$, $v_y$, and $v_z$ are the drift velocities in the x, y, and z directions, and D is the
diffusion coefficient. In trans1/cis2 there is no drift potential in the x and y directions
(Figure 4b) so

$v_x = v_y = 0$ (13)

in Equation 12.

The following initial value (I.V.) and boundary value (B.C.) conditions apply:

1) The particle is released at position (0, 0, 0) at time t = 0. This is represented by a delta
function $\delta(x,y,z)$:

I.V. $G(x, y, z, t=0) = \delta(x,y,z) = \delta(x)\,\delta(y)\,\delta(z)$ (14)

2) It is absorbed at the bottom of trans1/cis2 at t > 0:

B.C. 1 $G(x, y, L_{23} = L, t) = 0$ (15)

3) It is reflected at the sides of trans1/cis2 at t > 0:

B.C. 2 $D\,\partial G(x,y,z,t)/\partial x\,|_{x=\pm d/2} = 0$ (16)

B.C. 3 $D\,\partial G(x,y,z,t)/\partial y\,|_{y=\pm d/2} = 0$ (17)

4) It is reflected at the top of trans1/cis2:

B.C. 4 $D\,\partial G(x,y,z,t)/\partial z\,|_{z=0} = v_z\,G(x, y, 0, t), \quad t > 0$ (18)

Since the initial value is a separable function of x, y, and z (Equation 14), the above boundary
value problem in three dimensions can be considered mathematically as three boundary value
problems [18], one in each dimension, and the propagator function viewed as the product of
three independent propagator functions:

$G(x,y,z,t) = G_x(x,t)\,G_y(y,t)\,G_z(z,t)$ (19)

where

$G_x(x,t) = (2/d) \sum_{m=0}^{\infty} \cos(\alpha_m x/\sqrt{D})\,\exp(-\alpha_m^2 t)$ (20)

$G_y(y,t) = (2/d) \sum_{n=0}^{\infty} \cos(\beta_n y/\sqrt{D})\,\exp(-\beta_n^2 t)$ (21)

and

$G_z(z,t) = (2D/L)\,\exp(v_z z/(2D) - v_z^2 t/(4D)) \sum_{k=1}^{\infty} \sin(\omega_k L)\,\sin(\omega_k (z-L))\,\exp(-D\omega_k^2 t)/N(\omega_k)$ (22)

with

$\alpha_m = 2m\pi\sqrt{D}/d, \quad \beta_n = 2n\pi\sqrt{D}/d$ (23)

and

$N(\omega_k) = (D/v_z)(\exp(v_z L/D) - 1) - \{(v_z/D)(\exp(v_z L/D) - \cos 2\omega_k L) - 2\omega_k \sin 2\omega_k L\}/((v_z/D)^2 + 4\omega_k^2)$ (24)
If detection is defined to occur when the particle reaches z = L, the first passage time is the time
the particle crosses z = L at any x and y, -d/2 < x, y < d/2, so that its pdf $\varphi(t)$ can be written as

$\varphi(t) = \int_{-d/2}^{d/2} \int_{-d/2}^{d/2} \big(-D\,\partial G(x,y,z,t)/\partial z\,|_{z=L}\big)\,dx\,dy = \int_{-d/2}^{d/2} G_x(x,t)\,dx \int_{-d/2}^{d/2} G_y(y,t)\,dy \;\varphi_z(t)$ (25)

where

$\varphi_z(t) = 2D\,\exp(v_z L/(2D) - v_z^2 t/(4D)) \sum_{k=1}^{\infty} \omega_k \sin(\omega_k L)\,\exp(-D\omega_k^2 t)/N(\omega_k)$ (26)

Similar to separation of the three-dimensional boundary value problem defined by Equations
12-18 into three independent one-dimensional boundary value problems [18], one can consider
in physical terms a similar separation of diffusive effects in the three directions. With free
diffusion given by Equations 12-13 and only the initial condition in Equation 14, the diffusion
has a spatial mean of (0, 0, 0) and is independent in the three directions. Adding the reflective
boundaries z = 0, x = ±d/2, and y = ±d/2 (see Figure 2) and a positive drift potential ($V_{23}$ > 0)
causes the mean of the first passage time to z = L (which is an absorbing boundary, where
detection is considered to occur for any x and y; -d/2 < x, y < d/2) to be less than the mean time
when $V_{23}$ = 0. Considering $\varphi_z(t)$ in isolation, its distribution is in effect the one-dimensional
first passage time distribution with mean $E(T = T_z)$ and standard deviation $\sigma = \sigma_z$.

To see if diffusion in the x and y directions has any effect on G(x,y,z,t) consider the factor
$\int_{-d/2}^{d/2} G_x(x,t)\,dx$ in Equation 25 (the behavior of $G_y(y,t)$ is identical owing to the symmetry
in x and y). To compute it the method of images [19] can be used. Thus start without any boundary
conditions on x, which corresponds to free diffusion in x. $G_x(x,t)$ is then given by the heat kernel:

$G_x(x,t) = (1/\sqrt{4\pi D t})\,\exp(-x^2/(4Dt)), \quad -\infty < x < \infty$ (27)

This function is repeatedly reflected at x = ±d/2 resulting finally in

$G_x(x,t) = (1/\sqrt{4\pi D t})\,\big[\exp(-x^2/(4Dt)) + \sum_{k=1}^{\infty} \exp(-(x+kd)^2/(4Dt)) + \sum_{k=1}^{\infty} \exp(-(x-kd)^2/(4Dt))\big], \quad -d/2 < x < d/2$ (28)

Because probability is conserved, the integral of $G_x(x,t)$ over -d/2 < x < d/2 is the area under the
heat kernel function over $-\infty < x < \infty$, which is 1. A similar result holds for $G_y(y,t)$ by symmetry.
Hence Equation 25 reduces to

$\varphi(t) = \varphi_z(t)$ (29)

Thus diffusion in the x and y dimensions does not affect the translocation time in the z
dimension, assuming that arrival of the particle at any (x, y, z=L) is tantamount to detection.
Figure 6 shows the dependence on pore voltage of the mean E(T) and standard deviation $\sigma(T)$ of
the translocation time through trans1/cis2 for L = 1 μm.
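
The normalization argument behind Equation 29 can be checked numerically. The sketch below integrates each image term of Equation 28 over [-d/2, d/2] in closed form using the error function and sums a finite number of images. The compartment width d = 1 μm and the time t = 1 msec are hypothetical values chosen only for the illustration, as is the function name `reflected_mass`.

```python
import math


def reflected_mass(d, diffusion, t, n_images=50):
    """Integral over [-d/2, d/2] of the image sum in Equation 28.

    Each image is a heat kernel centered at k*d (k = -n_images..n_images);
    its integral over the interval is a difference of two error functions.
    """
    scale = math.sqrt(4 * diffusion * t)
    total = 0.0
    for k in range(-n_images, n_images + 1):
        upper = (d / 2 - k * d) / scale
        lower = (-d / 2 - k * d) / scale
        total += 0.5 * (math.erf(upper) - math.erf(lower))
    return total


if __name__ == "__main__":
    d = 1e-6      # compartment width, m (hypothetical)
    D = 3e-10     # diffusion coefficient, m^2/s (value quoted in the text)
    t = 1e-3      # elapsed time, s (hypothetical)
    print("free kernel only:", reflected_mass(d, D, t, n_images=0))
    print("with images     :", reflected_mass(d, D, t))
```

With the images included the integral is 1 to within rounding, whereas the free heat kernel alone leaves part of the probability outside the walls.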

Behavior at the interface between two sections 
Consider the probability currents at some fixed point $(x, y, L_{23}^{\pm})$ on either side of the interface
trans1/cis2-DNP. If there is an absorbing barrier at $L_{23}^{-}$ then the probability function on the
trans1/cis2 side would be

$G_3(x, y, L_{23}^{-}, t) = 0$ (30)

On the DNP side if there is a reflecting barrier the probability current would be

$J_4(x, y, L_{23}^{+}, t) = v_{z4}\,G_4(x, y, L_{23}^{+}, t) - D\,\partial G_4(x, y, L_{23}^{+}, t)/\partial z = 0$ (31)

$v_{z3} = \mu V_{23}/L_{23}, \quad v_{z4} = \mu V_{34}/L_{34}$ (32)

But there is really no barrier. The particle oscillates at the interface because of diffusion, before
eventually passing into DNP, such passage being aided directly by the positively directed drift
potentials in both compartments (and indirectly by the reflecting boundaries in trans1/cis2). Thus

$J_3(x, y, L_{23}^{-}, t) = v_{z3}\,G_3(x, y, L_{23}^{-}, t) - D\,\partial G_3(x, y, L_{23}^{-}, t)/\partial z \neq 0$ (33)

and

$J_4(x, y, L_{23}^{+}, t) = v_{z4}\,G_4(x, y, L_{23}^{+}, t) - D\,\partial G_4(x, y, L_{23}^{+}, t)/\partial z \neq 0$ (34)

Continuity requires

$J_3(x, y, L_{23}^{-}, t) = J_4(x, y, L_{23}^{+}, t)$ (35)

In order for the particle to translocate successfully through DNP in the z direction so that it can
be detected inside DNP, the net probability current at $L_{23}$ must be in the positive z direction. This
can be achieved with a sufficiently large $V_{05}$. Thus

$J_{34}(x, y, L_{23}, t) = J_3(x, y, L_{23}^{-}, t) = J_4(x, y, L_{23}^{+}, t) > 0$ (36)

The behavior at the interface between DNP and trans2 is similar.

The tapered geometry of trans1/cis2 in Figure 4 aids drift of the particle into DNP. It can be
modeled with a Fokker-Planck equation just as in Equation 12 but with a trapezoidal frustum
boundary. The resulting system of equations is not as easily solved as Equations 12 through 18
although it is amenable to numerical solution. One obvious result is that the translocation time is
decreased. Similar to the taper in trans1/cis2 aiding capture of the base at the entrance of DNP,
the abrupt increase in diameter from DNP to trans2 decreases the probability of a detected
particle regressing into DNP from trans2. One can also think of these two behaviors in terms of
entropy barriers [3]: the taper in trans1/cis2 decreases the barrier for entry into DNP (below what
it would be with a rectangular box), while the step change going from DNP to trans2 effectively
increases the barrier for a base regressing into DNP. Also as noted in Section III an extra drift
potential across trans2 can provide additional inducement for the detected particle to pass
through trans2 where it is recovered.
Slowing down translocation

The translocation of a base through DNP is too fast for the detector electronics. Methods to slow
down translocation include the use of 'molecular brakes', magnetic or optical tweezers,
alternative electrolytes such as LiCl, and increased salt concentration (see [3] for an overview).
Here an approach based on the use of an electric field is proposed.

Let the electric fields over the five sections of the tandem cell be $E_{01}$, $E_{12}$, $E_{23}$, $E_{34}$, and $E_{45}$.
Consider DNP in isolation. With a negative electric field $E_{34}$ over DNP of length $L_{34}$ = 10 nm,
$D = 3 \times 10^{-10}$ m$^2$/sec, and $\mu = 2.4 \times 10^{-8}$ m$^2$/volt-sec, the data in Figure 5 show an increase in
the mean translocation time, which indicates slowdown, but it is also accompanied by a significant
increase in the variance. With $V_{34}$ approaching -0.25 V, the mean has increased by 7 orders of
magnitude over the mean for $V_{34}$ = 0.25 V and the standard deviation is closely tracking the
mean, indicating that diffusion has started to take over.
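
The size of the slowdown can be estimated directly from the closed-form mean in Equation 8. The sketch below compares the mean translocation time at -0.25 V and +0.25 V for the stated pore length, diffusion coefficient, and mobility, taking $\alpha = \mu V/D$; the function name is illustrative, and the calculation treats DNP in isolation as the text does.

```python
import math

D = 3e-10      # diffusion coefficient, m^2/s (value quoted in the text)
MU = 2.4e-8    # nucleotide mobility, m^2/(V*s) (value quoted in the text)
L34 = 10e-9    # DNP length, m


def mean_translocation_time(voltage, length=L34):
    """Mean first passage time from Equation 8; voltage must be nonzero."""
    alpha = MU * voltage / D            # alpha = v_z L / D with v_z = mu V / L
    return (length ** 2 / (D * alpha)) * (1 - (1 - math.exp(-alpha)) / alpha)


slow = mean_translocation_time(-0.25)   # opposing field
fast = mean_translocation_time(0.25)    # assisting field
orders = math.log10(slow / fast)

if __name__ == "__main__":
    print(f"mean at -0.25 V: {slow:.3g} s")
    print(f"mean at +0.25 V: {fast:.3g} s")
    print(f"ratio: about 10^{orders:.1f}")
```

Under these assumptions the ratio of the two means is between $10^7$ and $10^8$, consistent with the 7 orders of magnitude read from Figure 5.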

For this approach to work: 1) A cleaved base entering DNP must not regress into trans1/cis2;
2) A detected base exiting into trans2 must not regress into DNP; 3) The probability that there is
more than one base in DNP must approach 0. To satisfy condition 1 a base moving from
trans1/cis2 into DNP has to experience a positive drift field at the interface. This requires that
$E_{23}$ and $E_{34}$ both be positive. To satisfy condition 2 a base moving from DNP into trans2 has to
experience a positive drift field at the interface. This requires that $E_{34}$ and $E_{45}$ both be positive.
Slowing down the base inside DNP requires $E_{34}$ to be negative. All three field sign conditions
may be satisfied if $L_{34}$ is split into three parts $L_{34-0,34-1}$, $L_{34-1,34-2}$, and $L_{34-2,34-3}$ with
respective electric fields $E_{34-0,34-1}$, $E_{34-1,34-2}$, and $E_{34-2,34-3}$ such that $E_{34-0,34-1} > 0$,
$E_{34-1,34-2} < 0$, and $E_{34-2,34-3} > 0$. Such an electric field profile is shown in Figure 7.

The earlier analysis of DNP may be extended to the behavior of a base that experiences this
kind of profiled potential in DNP. There is a tradeoff among the need to reduce the translocation
speed through DNP, the need to prevent regression from DNP into trans1/cis2, and the need to
prevent regression into DNP from trans2. Let $L_{34-0,34-1} = aL_{34}$, $L_{34-1,34-2} = bL_{34}$, and
$L_{34-2,34-3} = cL_{34}$, with a + b + c = 1. The first and second conditions require $E_{34-0,34-1}$ and
$E_{34-2,34-3}$ both to be sufficiently positive. Since two electrodes are required to define the
internal negative potential segment, each of a, b and c has a minimum value given by
$a_{min} = b_{min} = c_{min} = e_w + e_s$, where $e_w$ = width of electrode and $e_s$ = interelectrode spacing.
This spacing along with the applied voltages $V_{34-1}$ and $V_{34-2}$ can be used to determine the span
of the negative electric field over DNP (Figure 7). (The voltages themselves need not be negative;
it is the potential difference, and hence the corresponding electric field, that has to be negative.)

With this modification DNP can be represented as [Si pore-electrode-Si pore-electrode-Si
pore]. It may be possible to achieve the desired field profile if ultra thin graphene sheets (which
have been studied for their potential use in strand sequencing) are used as electrodes [13]. Figure
8 shows a schematic of the required modification, where voltages are applied to electrodes $V_0$,
$V_{3-1}$, $V_{3-2}$, and $V_4$. Thus 49% of the potential difference $V_{0,3-1}$ drops across each of UNP and
the segment $L_{34-0,34-1}$ and 0.5% across each of cis1 and trans1/cis2. A negative electric field
exists across the segment $L_{34-1,34-2}$ with $V_{3-1} > V_{3-2}$. With $V_4 > V_{3-2}$, 99% of the potential
difference $V_{3-2,4}$ drops across the segment $L_{34-2,34-3}$ and 1% across trans2.

The optimum electric field profile over DNP can be obtained by experiment. Here an
estimate is obtained by using Equations 8 and 10 from the one-dimensional problem and
ignoring the transitional behavior at the two ends. Let $V_{34-0,34-1} = V_{34-1} - V_{34-0} = V_a$,
$V_{34-1,34-2} = V_b$, and $V_{34-2,34-3} = V_c$. With $V_a = V_c$ = 0.1 volt and $V_b$ = -0.2 volt the mean and
standard deviation of the translocation time over each of the three segments of a nanopore of
length $L_{34}$ = 10 nm are shown in Table 1 for different values of a and b. The translocation over
the segment $[aL_{34}, aL_{34} + bL_{34}]$ is seen to be considerably slowed down by the negative field,
which also dominates the total translocation time over DNP.



Table 1

Translocation times over positive and negative electric field segments of DNP

a = c               Mean              Standard deviation   b                   Mean              Standard deviation
(positive field     ($10^{-8}$ sec)   ($10^{-8}$ sec)      (negative field     ($10^{-3}$ sec)   ($10^{-3}$ sec)
segment)                                                   segment)

0.1                 0.0365            0.0173               0.8                 7.405078          7.405067
0.2                 0.1458            0.0691               0.6                 4.165356          4.165350
0.3                 0.3281            0.1556               0.4                 1.851269          1.851267
0.4                 0.5834            0.2765               0.2                 0.4628174         0.4628167
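
The entries of Table 1 can be regenerated from Equations 8 and 10 applied segment by segment. The sketch below assumes $\alpha = \mu V/D$ for each segment (which follows from $v_z = \mu V/L$), segment lengths $aL_{34}$ and $bL_{34}$ with b = 1 - 2a, and the voltages $V_a$ = 0.1 volt and $V_b$ = -0.2 volt given in the text; the function and variable names are illustrative.

```python
import math

D = 3e-10      # diffusion coefficient, m^2/s (value quoted in the text)
MU = 2.4e-8    # nucleotide mobility, m^2/(V*s) (value quoted in the text)
L34 = 10e-9    # DNP length, m
V_A = 0.1      # volts across each positive-field segment
V_B = -0.2     # volts across the negative-field segment


def mean_and_std(voltage, length):
    """Mean (Equation 8) and standard deviation (Equation 10); voltage != 0."""
    alpha = MU * voltage / D
    scale = length ** 2 / D
    decay = math.exp(-alpha)
    mean = (scale / alpha) * (1 - (1 - decay) / alpha)
    bracket = 2 * alpha + 4 * alpha * decay - 5 + 4 * decay + decay ** 2
    std = (scale / alpha ** 2) * math.sqrt(bracket)
    return mean, std


def table1_rows(fractions=(0.1, 0.2, 0.3, 0.4)):
    """Rows of (a, mean+, std+, b, mean-, std-) in seconds, with b = 1 - 2a."""
    rows = []
    for a in fractions:
        b = 1 - 2 * a
        pos = mean_and_std(V_A, a * L34)
        neg = mean_and_std(V_B, b * L34)
        rows.append((a, pos[0], pos[1], b, neg[0], neg[1]))
    return rows


if __name__ == "__main__":
    for a, m_pos, s_pos, b, m_neg, s_neg in table1_rows():
        print(f"a={a:.1f}  {m_pos / 1e-8:.4f}  {s_pos / 1e-8:.4f}   "
              f"b={b:.1f}  {m_neg / 1e-3:.6f}  {s_neg / 1e-3:.6f}")
```

The printed columns are scaled to $10^{-8}$ sec for the positive-field segment and $10^{-3}$ sec for the negative-field segment so that they can be compared with the table directly.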



Probability of bases arriving at DNP out of order 

Assume that bases are cleaved at a rate of one every T seconds. Without loss of generality let
base 1 be cleaved at time t = 0 and base 2 at t = T. Let $T_i$ = time for base i to diffuse-drift over
trans1/cis2 and P = translocation time through the pore. $T_i$ and P are independent random
variables with pdfs $f_{T_i}(\cdot)$ and $f_P(\cdot)$, means $\mu_{T_i} = \mu_B$ and $\mu_P$, and standard deviations
$\sigma_{T_i} = \sigma_B$ and $\sigma_P$ respectively, where the $T_i$s are assumed to be i.i.d. If the two pdfs are
assumed to be unimodal [7, Figure 3] with finite support equal to six-sigma, the supports would
respectively be $[\max(0, \mu_{T_i} - 3\sigma_{T_i}),\ \mu_{T_i} + 3\sigma_{T_i}]$ and $[\max(0, \mu_P - 3\sigma_P),\ \mu_P + 3\sigma_P]$. The
following sufficient condition holds for the two bases to arrive in order:

$T > (\mu_{T_1} + 3\sigma_{T_1}) - \max(0, \mu_{T_2} - 3\sigma_{T_2})$ (37)

In [17] the turnover rate for exonuclease under normal conditions is 10 to 80 msecs. Setting
T = 10 msecs and $V_{23}$ = 1.5 mV and interpolating over the data in Figure 6 gives $\mu_{T_1} = \mu_{T_2}$ = 1.6
msecs, $\sigma_{T_1} = \sigma_{T_2}$ = 1.3 msecs. Using this in the inequality in Equation 37 gives

10 > 1.6 + 3 x 1.3 - max(0, 1.6 - 3 x 1.3) = 5.5

This means that the bases arrive sequentially at DNP. Using similar arguments, it can be shown
that detected bases passing into and through trans2 do so in their natural order.
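
The ordering condition in Equation 37 is simple enough to evaluate mechanically for any candidate cleaving interval. The sketch below encodes it with all times in msecs; the function names are illustrative, and the three-sigma truncation of the supports is the assumption stated in the text.

```python
def required_interval(mu1, sigma1, mu2, sigma2):
    """Right-hand side of Equation 37 (all times in the same unit)."""
    latest_first = mu1 + 3 * sigma1                  # latest arrival of base 1
    earliest_second = max(0.0, mu2 - 3 * sigma2)     # earliest transit of base 2
    return latest_first - earliest_second


def arrive_in_order(interval, mu1, sigma1, mu2, sigma2):
    """True if the cleaving interval satisfies the sufficient condition."""
    return interval > required_interval(mu1, sigma1, mu2, sigma2)


if __name__ == "__main__":
    # Values quoted in the text: T = 10 msecs, mean 1.6 msecs, std 1.3 msecs.
    print(required_interval(1.6, 1.3, 1.6, 1.3))
    print(arrive_in_order(10.0, 1.6, 1.3, 1.6, 1.3))
```

Because the condition is only sufficient, an interval that fails the test does not by itself imply that bases arrive out of order.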

Probability of two bases being inside DNP at the same time 

Let bases be cleaved every T seconds. For bases 1 and 2 to be inside the pore at the same time
base 2 must arrive in the pore before base 1 exits the pore. The condition for two bases not being
in the pore at the same time is then obtained as

$T + \max(0, \mu_{T_2} - 3\sigma_{T_2}) > \mu_{T_1} + 3\sigma_{T_1} + \mu_P + 3\sigma_P$ (38)

Take T = 10 msecs, $V_{34}$ = 200 mV, $V_{23}$ = 1.6 mV, $\mu_{T_1} = \mu_{T_2}$ = 1.6 msecs, $\sigma_{T_1} = \sigma_{T_2}$ = 1.3 msecs,
$\mu_P$ = 0.000013 msec, and $\sigma_P$ = 0.000004 msec. Using these numbers in Equation 38 gives

10 + max(0, 1.6 - 3 x 1.3) = 10 > 1.6 + 3 x 1.3 + 0.000013 + 3 x 0.000004

The two bases cannot be in the pore at the same time. The bandwidth required would be on the
order of 4 MHz.

Conversely, the minimum interval required between the release of two successive cleaved
bases on the exit side of UNP so that they do not occupy DNP at the same time is given by

$T_{min} = 3\sigma_{T_1} + 3\sigma_{T_2} + \mu_P + 3\sigma_P$ (39)

Using $\sigma_{T_1} = \sigma_{T_2}$ = 1.6 msecs, $\mu_P$ = 0.000013 msec, and $\sigma_P$ = 0.000004 msec

$T_{min}$ = 3 x 1.6 + 3 x 1.6 + 0.000013 + 3 x 0.000004 = 9.600025 msecs

In [5], T is given to be 1 to 10 msecs (compare with 10 to 80 msecs in [17]). With a suitable set
of controls (temperature, salt concentration, etc.) the enzyme can be set to cleave bases at a rate
that is < 1 every, say, 20 msecs.

With a negative electric field over part of DNP a rough bound can be obtained for the
probability of two bases being in DNP at the same time. Consider for example $L_{34}$ = 10 nm, b =
0.4, $V_b$ = -0.15 volt, and $V_{23}$ = 1.6 mV. From the data in Figures 5 and 6 and Table 1,
$\sigma_{T_1} = \sigma_{T_2}$ = 1.6 msecs, $\mu_P \approx$ 0.38 msec, and $\sigma_P \approx$ 0.38 msec. Using Equation 39

$T_{min}$ = 3 x 1.6 + 3 x 1.6 + 0.38 + 3 x 0.38 = 11.12 msecs

This is within the range of turnover rates achieved with exonuclease as given in [17]. With a
mean translocation time of 0.38 msec through a distance of $L_{34-1,34-2} = bL_{34}$ = 4 nm, the detector
bandwidth required would be on the order of 5 kHz, which comes close to the 1 msec
translocation time criterion [4] for the effective detection of a base inside a nanopore.
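
Equations 38 and 39 can be evaluated in the same way for the parameter sets quoted in this subsection. In the sketch below all times are in msecs and the function names are illustrative; Equation 39 is coded exactly as stated, so it inherits the assumptions made in deriving it from Equation 38.

```python
def no_overlap(interval, mu_t1, sigma_t1, mu_t2, sigma_t2, mu_p, sigma_p):
    """Equation 38: True if two successive bases cannot share DNP."""
    earliest_second = interval + max(0.0, mu_t2 - 3 * sigma_t2)
    latest_first_exit = mu_t1 + 3 * sigma_t1 + mu_p + 3 * sigma_p
    return earliest_second > latest_first_exit


def t_min(sigma_t1, sigma_t2, mu_p, sigma_p):
    """Equation 39: minimum interval between successive cleaved bases."""
    return 3 * sigma_t1 + 3 * sigma_t2 + mu_p + 3 * sigma_p


if __name__ == "__main__":
    # Uniform positive field over DNP (values quoted in the text).
    print(no_overlap(10.0, 1.6, 1.3, 1.6, 1.3, 0.000013, 0.000004))
    print(t_min(1.6, 1.6, 0.000013, 0.000004))
    # Negative field over part of DNP (values quoted in the text).
    print(t_min(1.6, 1.6, 0.38, 0.38))
```

The second and third calls show how strongly the pore term contributes once the negative field segment raises the mean and spread of the translocation time through DNP.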

V IMPLEMENTATION ISSUES 

The tandem cell assumes a hybrid implementation with a biological nanopore for UNP and a 
synthetic one for DNP. A number of implementation issues are considered below. 

Positioning the enzyme 

The enzyme on the exit side of UNP need only be engineered so that it is in the path of the 
threading DNA sequence such that the first base of the remaining sequence is presented to it. If 
the leading base is not cleaved then it would mean that the ssDNA has either stopped moving or 
has slipped past the enzyme without being cleaved at all (since cleaving can only occur at the
leading base). Failure to detect the characteristic pulses for the four (or more, if modified bases 
are considered) base types would indicate that no cleaving has occurred. 

Voltage drift 

With an ion-selective DNP pore, ion currents, which are typically on the order of a few hundred pA, 
can lead to an electro-osmotic potential which with the passage of time can cause buildup of 
charge in the pore and lead to the pore voltage drifting over time. Methods commonly used in 
electronic measurements may be used to solve the voltage drift problem. One of these is to use a 
stable reference voltage against which the drift is tracked and the difference subtracted from the 
recorded data (similar to the moving average in statistical trend analysis of time series data). 
Alternatively the trans1/cis2 and trans2 compartments and DNP can be drained periodically and 
refilled with electrolyte. To prevent the occurrence of deletion errors due to cleaved bases still in 
transit through trans1/cis2 while draining is taking place, the draining step may be preceded by 
retraction of the strand in UNP (achieved by temporarily lowering or reversing the potential V05) 
and pausing until the cleaved bases in transit have passed into DNP and been detected through 
their characteristic blockade levels. 
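
The moving-average analogy mentioned above can be illustrated with a short sketch. It does not model a hardware reference voltage; instead it assumes the slow drift can be estimated as a trailing moving average of the recorded trace and subtracted sample by sample. The function name, the window length, and the sample values are hypothetical.

```python
from collections import deque

def remove_drift(trace, window):
    """Subtract a trailing moving-average baseline from a recorded trace.

    trace  : iterable of recorded samples (e.g. pore current or voltage)
    window : number of most recent samples used to estimate the baseline
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()          # samples currently inside the averaging window
    running_sum = 0.0
    corrected = []
    for sample in trace:
        recent.append(sample)
        running_sum += sample
        if len(recent) > window:
            running_sum -= recent.popleft()
        baseline = running_sum / len(recent)   # current drift estimate
        corrected.append(sample - baseline)
    return corrected

# A slowly rising baseline (drift) with no blockade events.
drifting = [0.5 * i for i in range(8)]
print(remove_drift(drifting, window=4))
```

With a window much longer than a single blockade pulse, the baseline estimate follows the drift while the short pulses remain in the corrected trace; the window length would have to be chosen experimentally.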

Cleaved bases sticking to the side wall of trans1/cis2 

A cleaved base may stick to a side wall while diffusing inside trans1/cis2. The probability of this 
event can be calculated using the model in Section IV. Bases can be prevented from sticking by 
holding the side wall of trans1/cis2 at a slightly negative potential (the appropriate value to use 
can be determined experimentally) with respect to Vi. This effectively creates a reflecting wall 
for the negatively charged base. Alternatively the compartment walls may be chemically treated 
to prevent sticking. For example, the use of a lipid coat to control clogging of a translocating 
protein in a solid-state nanopore is described in [20]. 

Solid state pore for DNP 

A solid state pore has the advantages of scaling and integration in fabrication and has been 
studied widely both experimentally and theoretically in the context of DNA sequencing. While 
such pores are not useful in strand sequencing for single-nucleotide discrimination because of 
their thickness (currently they have a minimum thickness of 20 nm and an hourglass shape with 
actual pore thickness of 10 nm [10]), with exonuclease sequencing using a tandem cell this may 
not be a problem because of the near zero probability of two nucleotides being in the nanopore at 
the same time, as shown above. Such pores are also easier to fabricate because of the relaxed 
tolerances on both electrode width and electrode spacing, which is especially advantageous if a 
negative electric field over DNP is used to slow down translocation. 

Negative field over DNP, graphene electrodes, graphene pores 

Implementing a negative field over part of DNP could take the form of two graphene sheets 
interposed between three layers of silicon pores and acting as electrodes across which the 
negative field is applied. The resulting DNP would then have the structure [Si pore-graphene 
electrode-Si pore-graphene electrode-Si pore]. See Figure 8. 

A graphene nanopore by itself can be used for exonuclease sequencing with a tandem cell. 
The cell would then have the structure [cis1, UNP, trans1=cis2, DNP, trans2], where DNP is [Si 
pore-graphene electrode-Si pore-graphene sheet-Si pore-graphene electrode-Si pore]. The 
translocating cleaved base is then identified by the level of the transverse current through the 
graphene sheet [13]. 

Negative fields may also be useful for translocation slowdown in strand sequencing, for 
which nanopores in graphene nanoribbons have been investigated recently [11-13]. The 
translocation speed can be controlled by embedding the graphene layer in the negative-field 
segment of a solid-state nanopore with an electric field profile similar to that in Figure 7. A cell 
would then have the structure [cis-Si pore-graphene electrode-Si pore-graphene sheet-Si 
pore-graphene electrode-Si pore-trans], where the large thickness of the Si pore, by itself a 
disadvantage in sequencing, is no longer important. 

In both strand and exonuclease sequencing based on the above approach, the μA-level 
transverse currents through the graphene sheet combined with the negative-field-based 
slowdown of strand or cleaved base could provide an efficient sequencing method with a lower 
detection bandwidth and S/N ratios that are significantly higher than with a method that relies on 
discrimination based only on axial ionic pore currents of ~100 pA. 

VI DISCUSSION 

1. With a tandem cell, repeat bases (homopolymers) do not present a problem because of the 
time and space separation between successive cleaved bases. 

2. An arbitrary number of tandem cells could be implemented in parallel. With a sequencing rate 
of 100 bases/second per cell, an array of 10000 cells can potentially sequence a billion (10⁹) 
bases in about 1000 seconds (~17 minutes). 

3. A pipeline of tandem cells with a [cis, trans/cis, ..., trans/cis, trans] structure may be used for 
error checking and/or to obtain upstream-downstream correlations in real time. With an N-stage 
pipeline N× coverage is possible, requiring, at least in principle, little more time than the time 
needed for 1× sequencing. 

4. A recent report describes the use of heavy tags attached to mononucleotides that are used by a 
processive enzyme to synthesize DNA threaded through the enzyme [21]. When a nucleotide is 
added to the growing DNA strand, the tag is cleaved and drops through a nanopore causing a 
blockade event that is unique for each of four different tag types. If the tags could be attached to 
bases in ssDNA, this method could be adapted for use with the tandem-cell method proposed 
here. The heavier base-specific tags could lead to better discrimination among the base types. 

5. A modified version of the tandem cell may be considered for polypeptide sequencing. Such a 
structure would be more complicated than the one for DNA sequencing because: a) peptides can 
be positively or negatively charged or charge neutral; and b) different peptidases have to be used 
for different types of peptides. This will require cycling the shrinking polypeptide through a 
pipeline of tandem cells, one for each type of peptide, as well as an algorithm to assemble the 
sequence from the signal data consisting of multiple time series, one for each peptide type. 
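
The throughput estimate in item 2 above is simple arithmetic and can be checked with the sketch below. It assumes ideal conditions (every cell runs continuously at the stated rate, with no idle time or overhead); the function and parameter names are illustrative only.

```python
def sequencing_time_s(num_bases, num_cells, bases_per_second_per_cell):
    """Time (seconds) for an array of identical cells working in parallel
    to sequence num_bases, assuming no overhead or idle time."""
    return num_bases / (num_cells * bases_per_second_per_cell)

seconds = sequencing_time_s(1e9, 10_000, 100)
print(f"{seconds:.0f} s = {seconds / 60:.1f} min")
```

For 10⁹ bases, 10000 cells, and 100 bases/second per cell this gives 1000 seconds, or just under 17 minutes.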
Figure Captions and Figures 

Figure 1. Schematic of nanopore DNA sequencing: (a) strand sequencing; (b) exonuclease 
sequencing. AHL nanopore ~10 nm long has broad vestibule (length 5 nm) and narrow 
barrel/stem (length 5 nm, smallest diameter 2 nm). Pore is embedded in lipid bilayer (membrane) 
~5 nm thick. 

Figure 2. Schematic of proposed tandem cell. (a) Tandem cell with five sections or stages in 
pipeline: cis1, UNP (depicted here as an AHL pore), trans1 (= cis2), DNP (depicted here as a 
solid-state pore), and trans2. Dimensions considered in the text are: 1) cis1: box of height 1 μm 
and cross-section 1 μm²; 2) UNP: AHL pore of length 8 nm and diameter 2 nm; 3) trans1/cis2: 
tapered box of length 1 μm tapering from cross-section of 1 μm² to 4 nm²; 4) DNP: solid-state 
pore of length 10-20 nm and diameter 2 nm; 5) trans2: box of height 1 μm and cross-section 1 
μm². (b) Voltage profile. Electrodes assumed inserted at top of cis1 and trans2. Negative field 
across graphene electrodes inserted laterally into DNP slows down translocating base (see text 
and Figure 8). 

Figure 3. Tandem cell with reinforcing potential difference in trans2. Two electrodes in 
trans2, instead of one; used to apply reinforcing drift field to prevent detected base from 
regressing into DNP. 

Figure 4. Coordinate systems for models. Different coordinate systems used for (a) Stage 4, 
(b) Stage 3, and (c) Stage 3. Dimensions used: (a) L = 8-10 nm; (b) L = 1 μm, w_p = 1 μm; (c) L_t = 
1 μm, w_t = 1 μm, w_p = 2 nm. 

Figure 5. Translocation statistics of DNP. Mean and standard deviation of time for particle to 
translocate from time of entry into DNP (negligible cross-section and length L = 8-10 nm) to 
time of exit into trans2. Parameter values used: mononucleotide mobility μ = 2.4 × 10⁻⁸ m²/volt- 
sec, diffusion constant D = 3 × 10⁻¹⁰ m²/sec. Calculations are for typical absolute potential 
difference in the range 0.1-0.3 volt [5]. Negative field across DNP results in markedly increased 
translocation times, sufficient to reduce detection bandwidth significantly. 

Figure 6. Translocation statistics of trans1/cis2. Mean and standard deviation of translocation 
time for particle (cleaved base) released by exonuclease at top of trans1/cis2 (= 3-dimensional 
box with height 1 μm and cross-section 1 μm²) to move to entrance of DNP. Parameter values 
used: mononucleotide mobility μ = 2.4 × 10⁻⁸ m²/volt-sec, diffusion constant D = 3 × 10⁻¹⁰ 
m²/sec. Calculations for cell voltages of 0.1-0.3 volt, with ~1-2 mV dropping across trans1/cis2. 

Figure 7. Voltage and electric field profiles over DNP. Example of a profiled voltage in which 
the pore length is divided into three segments L34-1, L34-2, and L34-3, with lengths L34-1 + L34-2 + L34-3 
= L34. Electric field is positive over L34-1 and L34-3, negative over L34-2. (The voltages themselves 
need not be negative. Thus V34-1 - V34-0 > 0, V34-3 - V34-2 > 0, and V34-2 - V34-1 < 0.) Also V34-0 > V23 
(voltage across trans1/cis2) and V34-3 < V45 (voltage across trans2). 

Figure 8. Tandem cell modified for negative field over segment of DNP. Graphene 
electrodes laterally inserted into DNP. Negative field applied over middle segment of DNP 
through graphene electrodes set to voltages V3-1 and V3-2. Thus V0 < V3-1, V3-2 < V3-1, V3-2 < V5, 
where V0 and V5 represent electrodes in cis1 and trans2. 